Architectures for Deep Web Data Extraction and Integration
Abstract
The Deep Web, a rich and largely unexplored data source, is becoming an important research topic. In previous years, data extraction from Web pages received a lot of attention, and considerable experience has also been accumulated in the integration of traditional relational databases. Today these research areas are converging, leading to the development of systems for Deep Web data extraction and integration. Several approaches have been proposed, and many systems have been built that enable the extraction and integration of Deep Web data. In this paper we propose a classification framework that allows different approaches to be compared against a full model of the data extraction and integration process. We classify the most important systems reported in the literature according to the proposed framework and evaluate them with respect to their capabilities and their coverage of the process. We conclude with a refinement of the architecture that would cover the complete data extraction and integration process for Web sources.
Similar resources
Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery
Extracting buildings from high-spatial-resolution mono optical aerial imagery has long been considered an important challenge in map preparation. The goal of the current research is to use semantic segmentation of mono optical aerial imagery to extract buildings, based on the combination of deep convolutional neural networks (DCNN) an...
On Social Network Web Sites: Definition, Features, Architectures and Analysis Tools
Development and usage of online social networking web sites are growing rapidly. Millions of members of these web sites publicly articulate mutual "friendship" relations and share user-created content, such as photos, videos, files, and blogs. Advances in web design technology and the fast-growing usage of online resources have prompted web designers to improve features and architectures of social ...
Information Discovery, Extraction and Integration for the Hidden Web
In this paper, we report our initial investigations on the problems of automatically extracting data objects from a given hidden-web source (i.e., the web site with an HTML search form) and automatically assigning semantics to the extracted data. We also propose some future work to address the problem of information discovery and integration for hidden-web sources.
Efficient Web Data Mining with Standard XML Technologies
The problem of Web data extraction and an XML-based methodology whose goal extends far beyond simple “screen scraping” are discussed. An ideal data extraction process is able to digest target Web databases that are visible only as HTML pages and create a local, identical replica of those databases as a result. What is needed in this process is much more than a Web crawler and set of Web site wrap...